Abstract

We present the first streaming algorithm for counting an arbitrary hypergraph H of constant
size in a massive hypergraph G. Our algorithm can handle both edge insertions and
edge deletions, and is applicable in the distributed setting. Moreover, our approach provides
the first family of graph polynomials for the hypergraph counting problem. Because of the
close relationship between hypergraphs and set systems, our approach may have applications in
studying similar problems.



1 Introduction 



The problem of counting subgraphs is one of the fundamental questions in algorithm design, and has
various applications in analyzing the clustering and transitivity coefficients of networks, uncovering
structural information of graphs that model biological systems, and designing graph databases.
While the exact counting of subgraphs of constant size is polynomial-time solvable, traditional
algorithms need to store the whole graph and compute the solution in an off-line fashion, which is
not practical even for graphs of medium size. A modern way to deal with this problem is to design
algorithms in the streaming setting, where the edges of the underlying graph arrive sequentially
in an arbitrary order, and algorithms with sub-linear space are required to approximately count
the number of occurrences of certain subgraphs. Since the first streaming algorithm by Bar-Yossef
et al. |3|, this problem has received much attention in recent years |2l44l. Ifil. M. Illl. Il2l. Il4 1 . 

We address the subgraph counting problem for hypergraphs. Formally, we are given a sequence
of sets $s_1, s_2, \ldots$ in a data stream. These sets, each consisting of vertices of the underlying
hypergraph $G$, arrive sequentially and represent edges of a hypergraph $G = (V, E)$. Moreover,
every arriving edge $e_j$ is equipped with a sign ("+" or "-"), indicating whether $e_j$ is inserted into
or deleted from the hypergraph $G$. That is, we study the so-called turnstile model [15], where the
underlying graph may change over time. For any hypergraph $H$ of constant size, algorithms with
sub-linear space are required to approximate the number of occurrences of $H$ in $G$.

Motivation. Hypergraphs are basic models to characterize precise relations among items of data 
sets. For the study of databases, people started to use hypergraphs to model database schemes 
since 1980s [a, l8|, and this line of research led to several well-known data storage mechanisms 
like HyperGraphDB [l|]. Besides database theory, a number of studies have shown that simple 
graphs¹, representing pairwise relationships, are usually not sufficient to encode all information
when studying social, protein, or biological networks, and have suggested using hypergraphs to model
the real relations among the items. To illustrate this point, let us look at the coauthor



¹ For ease of discussion, simple graphs refer to graphs in which every edge consists of two vertices.



network as an example. In a coauthor network, authors are represented as vertices of a graph, and
an edge between two authors exists iff these two persons are co-authors. This natural model misses
the information on whether a set of three (or more) authors have co-authored the same
article. Such information loss is undesirable for many applications, e.g., for detecting communities
or clusters such as all authors who worked in the same research area. Similar problems occur in
studying biological, social, and other networks, where hypergraphs are required in order to express
the complete relation among entities 13|, [la] ■ 



Our Results & Techniques. We initiate the study of counting subgraphs of hypergraphs in the streaming
setting, and present the first algorithm for this problem. Although the subgraph counting problem
is much more difficult for the case of hypergraphs and streaming algorithms were unknown even for
the edge-insertion case prior to our work, our algorithm runs in the general turnstile model, and
is applicable in the distributed setting. Formally, for any fixed subgraph $H$ of constant size, our
algorithm $(1 \pm \varepsilon)$-approximates the number of occurrences of $H$ in $G$, denoted by $\#H$. That is, for any constant
$\varepsilon \in (0, 1)$, the output $Z$ of our algorithm satisfies $Z \in [(1 - \varepsilon) \cdot \#H, (1 + \varepsilon) \cdot \#H]$ with probability at
least 2/3. The main result of our paper is as follows:

Theorem 1 (Main Result). Let $G$ be a hypergraph of $n$ vertices and $m$ edges, and $H$ a hypergraph
of $k$ edges and minimum degree at least 2. Then there is an algorithm to $(1 \pm \varepsilon)$-approximate the
number of occurrences of $H$ in $G$ that uses $O\left(\frac{1}{\varepsilon^2} \cdot \frac{m^k}{(\#H)^2} \cdot \log n\right)$ bits of space. The update time per
arriving edge is $O\left(\frac{1}{\varepsilon^2} \cdot \frac{m^k}{(\#H)^2}\right)$. Our algorithm works in the turnstile model.

To compare our algorithm with naive methods, note that a naive approach for counting $\#H$
needs to sample independently either $k$ vertices (if possible) or $k$ edges from the stream. Since the
probability of $k$ vertices (or $k$ edges) forming $H$ is $\#H/n^k$ (or $\#H/m^k$), this approach needs space
$\Omega\left(\frac{n^k \log n}{\#H}\right)$ and $\Omega\left(\frac{m^k \log n}{\#H}\right)$, respectively. Thus our algorithm is a significant improvement over
the naive approach. On the other hand, we note that for any graph $G$ of $m$ edges and hypergraph
$H$ of $k$ edges, the number of occurrences of $H$ in $G$ can be as large as $\Omega(m^{k/2})$. Hence for dense graphs with
$\#H = \omega\left(m^{(k-1)/2}\right)$, our algorithm achieves a $(1 \pm \varepsilon)$-approximation in sublinear space.

Our algorithm uses the composition of complex-valued random variables. Besides presenting
the first hypergraph counting algorithm in the streaming setting, our approach yields a family of
graph polynomials $\{p_H\}$ to count the number of occurrences of a hypergraph $H$ in a hypergraph $G$. That is, for any
hypergraph $H$ the polynomial $p_H$ takes hypergraph $G$ as an argument, and the value of $p_H(G)$
is the number of isomorphic copies of $H$ in $G$. This is the first family of graph polynomials for
the hypergraph counting problem, and the techniques developed here may have applications in
studying graph theory or related topics.

Theorem 2. For any hypergraph $H$, there is a graph polynomial $p_H(\cdot)$ such that for any hypergraph
$G$, $p_H(G) \in \mathbb{N} \cup \{0\}$ is the number of isomorphic copies of $H$ in $G$.



Our algorithm follows the framework by Kane et al. [13]. For any hypergraph $H$ of $k$ edges, we
maintain $k$ variables $Z_{e_1^*}, \ldots, Z_{e_k^*}$, where each variable $Z_{e_i^*}$ corresponds to one edge of $H$. For every
arriving edge $e$ of graph $G$, we choose one or more $Z_{e_i^*}$ to update according to the values of hash
functions. We will prove that the returned value of $\prod_{1 \le i \le k} Z_{e_i^*}$ is unbiased. However, in contrast
to the simple graph case, the algorithm for hypergraphs and its analysis are much more complicated,
for the following reasons:

1. In contrast to simple graphs, subgraph isomorphism between hypergraphs is more difficult
to handle, and hence the update procedure for every arriving edge is more involved. To
overcome this, for every arriving edge $e$ of hypergraph $G$ that consists of $\ell$ vertices, we look at all $\ell!$
permutations of $\{1, \ldots, \ell\}$, and every such permutation gives $e$ an "orientation". Moreover,
instead of updating every $Z_{e^*}$ simultaneously as in the simple graph case, we choose one or more
$Z_{e^*}$ to update. Through this, we prove that the returned value of our estimator is unbiased
for the number of occurrences of $H$ in $G$.

2. The second difficulty in dealing with hypergraphs comes from analyzing the concentration of
the estimator. All previous works on the subgraph counting problem, e.g. [11, 12, 14], indicate
that the space requirement of the algorithm depends on the number of other subgraphs in
the underlying graph. For instance, the space complexity of the algorithms by [11, 12, 14] is
essentially determined by the number of closed walks of certain length in graph $G$. However,
the notion of closed walks in (non-uniform) hypergraphs is not well-defined, and hence we
need to use alternative methods to analyze the concentration of the estimator, as well as the
space requirement.

Because of these differences, our generalization is non-trivial and elegant. Our result (Theorem 1)
shows that the regularity of hyperedges in $G$ and $H$ does not influence the space complexity
of the algorithm, and that the time and space complexity of our algorithm are the same as in the simple
graph case.

Notation. Let $G = (V, E)$ be a hypergraph. The sets of vertices and edges are denoted
by $V[G]$ and $E[G]$. We assume that graph $G$ has $n$ vertices, and that $n$ is known in advance. Graph $G$
is called a hypergraph if every edge $e \in E[G]$ is a non-empty subset of $V[G]$, i.e., $E[G]$ is a subset
of the power set of $V[G]$. For any hypergraph $G$ and vertex $u \in V[G]$, the degree of $u$, denoted
by $\deg(u)$, is the number of edges that include $u$. Moreover, the size of an edge $e \in E[G]$, denoted by
$\mathrm{size}(e)$, is the number of vertices contained in $e$.

Given two hypergraphs $H_1$ and $H_2$, we say that $H_1$ is homomorphic to $H_2$ if there is a mapping
$\varphi : V[H_1] \to V[H_2]$ such that for any set $D \subseteq V[H_1]$, $D \in E[H_1]$ implies that $\{\varphi(u) : u \in D\}$ is in $E[H_2]$.
We say that $H_1$ is isomorphic to $H_2$ if the above function $\varphi$ is a bijection. For any hypergraph $H$,
an automorphism of $H$ is an isomorphism from $V[H]$ to $V[H]$. Let $\mathrm{auto}(H)$ be the number of
automorphisms of $H$. For any hypergraph $H$, we call a subgraph $H_1$ of $G$ that is not necessarily
induced an occurrence of $H$ if $H_1$ is isomorphic to $H$. Let $\#(H, G)$ be the number of occurrences
of $H$ in $G$.
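
To make the definitions of $\mathrm{auto}(H)$ and $\#(H, G)$ concrete, the following is a minimal brute-force sketch for tiny inputs. It is not part of the streaming algorithm: it stores the whole hypergraph and enumerates all injective vertex maps. The function names (`count_embeddings`, `auto`, `occurrences`) and the representation of edges as tuples of vertex labels are illustrative assumptions, and $H$ is assumed to have no isolated vertices.

```python
from itertools import permutations

def count_embeddings(h_edges, g_edges):
    """Number of injective maps V[H] -> V[G] that send every edge of H onto an edge of G."""
    h_vertices = sorted({v for e in h_edges for v in e})
    g_vertices = sorted({v for e in g_edges for v in e})
    g_edge_set = {frozenset(e) for e in g_edges}
    count = 0
    for image in permutations(g_vertices, len(h_vertices)):
        phi = dict(zip(h_vertices, image))
        if all(frozenset(phi[v] for v in e) in g_edge_set for e in h_edges):
            count += 1
    return count

def auto(h_edges):
    """Number of automorphisms of H (injective edge-preserving self-maps)."""
    return count_embeddings(h_edges, h_edges)

def occurrences(h_edges, g_edges):
    """#(H, G): every occurrence is hit by exactly auto(H) embeddings."""
    return count_embeddings(h_edges, g_edges) // auto(h_edges)

# H has one edge of size 3 and one of size 2; G contains two copies of it.
H = [("a", "b", "c"), ("a", "b")]
G = [(0, 1, 2), (0, 1), (1, 2), (2, 3)]
print(auto(H), occurrences(H, G))  # prints: 2 2
```

Such an exhaustive count is only feasible for very small graphs, but it provides ground truth against which a streaming estimate can be compared.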

Let $S_\ell$ be the permutation group of $\ell$ elements. A $k$th root of unity is any number of the form
$e^{2\pi i \cdot j/k}$, where $0 \le j < k$.

2 An Unbiased Estimator for Counting Hypergraphs 

Throughout the rest of the paper we assume that hypergraph $G$ has $n$ vertices and $m$ edges, and
hypergraph $H$ has $t$ vertices and $k$ edges. We denote vertices of $G$ by $u$, $v$ and $w$,
and vertices of $H$ by $a$, $b$ and $c$. For every edge $e^*$ of $H$, we give the vertices in $e^*$ an
arbitrary ordering and call this oriented edge $\vec{e^*}$. For simplicity and with slight abuse of notation,
we will use $H$ to denote such an oriented hypergraph.

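At a high level, our estimator maintains $k$ complex variables $Z_{\vec{e^*}}$, $e^* \in E[H]$. These complex
variables correspond to the $k$ edges of hypergraph $H$, and are set to zero initially. For every arriving
edge $e = \{u_1, \ldots, u_\ell\} \in E[G]$ with $\mathrm{size}(e) = \ell$, we update every $Z_{\vec{e^*}}$ with $\mathrm{size}(e^*) = \mathrm{size}(e)$ according to

$$Z_{\vec{e^*}}(G) \leftarrow Z_{\vec{e^*}}(G) + \sum_{(\sigma(1), \ldots, \sigma(\ell)) \in S_\ell} M_{\vec{e^*}}\left(u_{\sigma(1)}, \ldots, u_{\sigma(\ell)}\right),$$

where the summation is over all possible permutations of $(1, \ldots, \ell)$, and $M_{\vec{e^*}} : (V[G])^\ell \to \mathbb{C}$ can be
computed in constant time. Hence we can rewrite $Z_{\vec{e^*}}$ as

$$Z_{\vec{e^*}}(G) = \sum_{\substack{e \in E[G] \\ \mathrm{size}(e) = \mathrm{size}(e^*) \\ e = \{u_1, \ldots, u_\ell\}}} \; \sum_{(\sigma(1), \ldots, \sigma(\ell)) \in S_\ell} M_{\vec{e^*}}\left(u_{\sigma(1)}, \ldots, u_{\sigma(\ell)}\right).$$

Intuitively, $M_{\vec{e^*}}(u_{\sigma(1)}, \ldots, u_{\sigma(\ell)})$ expresses the event of giving edge $e = \{u_1, \ldots, u_\ell\}$ in $G$ an orientation
according to a permutation $(\sigma(1), \ldots, \sigma(\ell))$, and mapping this oriented edge $\vec{e}$ to $\vec{e^*}$. When the number
of occurrences of $H$ is queried, the algorithm outputs the real part of $\alpha \cdot \prod_{e^* \in E[H]} Z_{\vec{e^*}}$, where $\alpha \in \mathbb{R}^+$ is a
scaling factor that will be determined later.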

More formally, each $M_{\vec{e^*}}(u_1, \ldots, u_\ell)$ is defined according to the degrees of the vertices in graph $H$ and
determined by three types of random variables $Q$, $X_c(w)$ and $Y(w)$, where $c \in V[H]$ and $w \in V[G]$:
(1) Variable $Q$ is a random $r$th root of unity, where $r := 2^t - 1$. (2) For vertices $c \in V[H]$ and $w \in V[G]$,
$X_c(w)$ is a random $\deg_H(c)$th root of unity, and for each vertex $c \in V[H]$, $X_c : V[G] \to \mathbb{C}$ is chosen
independently and uniformly at random from a family of $(2t \cdot k)$-wise independent hash functions,
where $2t \cdot k = O(1)$. Variables $Q$ and $X_c$ ($c \in V[H]$) are chosen independently. (3) For every
$w \in V[G]$, $Y(w)$ is a random element chosen from $S := \{1, 2, 4, 8, \ldots, 2^{t-1}\}$ as part of a $4k$-wise
independent hash function. Variables $Y(w)$ ($w \in V[G]$) and $Q$ are chosen independently.

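Given these, for every oriented edge $\vec{e^*} = (c_1, \ldots, c_\ell)$ of $H$ we define the function $M_{\vec{e^*}}$ as

$$M_{\vec{e^*}}(u_1, \ldots, u_\ell) := \prod_{1 \le j \le \ell} X_{c_j}(u_j) \cdot Q^{\frac{Y(u_j)}{\deg_H(c_j)}}.$$

See Estimator 1 for the formal description of the update and query procedures.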

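Estimator 1 Counting $\#(H, G)$

Update Procedure: When an edge $e = \{u_1, \ldots, u_\ell\} \in E[G]$ arrives, update each $Z_{\vec{e^*}}$ with $\mathrm{size}(e^*) = \ell$
w.r.t.

$$Z_{\vec{e^*}}(G) \leftarrow Z_{\vec{e^*}}(G) + \sum_{(\sigma(1), \ldots, \sigma(\ell)) \in S_\ell} M_{\vec{e^*}}\left(u_{\sigma(1)}, \ldots, u_{\sigma(\ell)}\right). \tag{1}$$

Query Procedure: When $\#(H, G)$ is required, output the real part of

$$\frac{t^t}{t! \cdot \mathrm{auto}(H)} \cdot Z_H(G), \tag{2}$$

where $Z_H(G)$ is defined by

$$Z_H(G) := \prod_{e^* \in E[H]} Z_{\vec{e^*}}(G). \tag{3}$$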



Before analyzing the algorithm, let us briefly discuss some of its properties. First,
the estimator runs in the turnstile model. For simplicity we only write the update procedure for
the edge-insertion case. For every arriving item that represents an edge deletion, we replace "+"
by "-" in (1). Second, our estimator works in the distributed setting, where there are several
distributed sites, and each site receives a stream $S_i$ of hyperedges. In such settings every local site
applies the same update procedure to the edges arriving in its local stream $S_i$. When the number of subgraphs is queried,
these sites cooperate to give an approximation of $\#(H, G)$ for the underlying graph $G$ formed by
$\bigcup_i S_i$. Third, we can generalize Estimator 1 to the labelled graph case. Namely, there are labels
for every vertex (and/or edge) in $G$ and $H$, and the algorithm can count the number of isomorphic
copies of $H$ in $G$ whose labels are the same as those of $H$.
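
The turnstile and distributed properties both follow from the fact that each variable is a linear function of the stream. The sketch below illustrates this for a single copy of the estimator state. It is a simplified illustration under stated assumptions, not the authors' implementation: fully random lookup tables stand in for the limited-independence hash functions, vertices of $G$ are assumed to be the integers $0, \ldots, n-1$, and the class and method names (`HypergraphSketch`, `update`, `merge`, `product`) are hypothetical.

```python
import cmath
import random
from itertools import permutations

class HypergraphSketch:
    """One copy of the estimator state: one complex counter per oriented edge of H."""

    def __init__(self, h_edges, n, seed):
        rng = random.Random(seed)
        self.h_edges = [tuple(e) for e in h_edges]   # fixed orientation of each edge of H
        h_vertices = sorted({c for e in self.h_edges for c in e})
        t = len(h_vertices)
        self.r = 2 ** t - 1
        self.deg = {c: sum(c in e for e in self.h_edges) for c in h_vertices}
        # Assumption: random tables replace the (2t*k)-wise and 4k-wise independent hashes.
        self.x = {c: [cmath.exp(2j * cmath.pi * rng.randrange(self.deg[c]) / self.deg[c])
                      for _ in range(n)] for c in h_vertices}
        self.y = [2 ** rng.randrange(t) for _ in range(n)]  # Y(w) in {1, 2, ..., 2^(t-1)}
        self.q_index = rng.randrange(self.r)                # Q = exp(2*pi*i*q_index / r)
        self.z = [0j] * len(self.h_edges)

    def _m(self, e_star, oriented):
        """M for one orientation: product of X_c(u) * Q^(Y(u)/deg_H(c))."""
        term = 1 + 0j
        for c, u in zip(e_star, oriented):
            phase = 2 * cmath.pi * self.q_index * self.y[u] / (self.r * self.deg[c])
            term *= self.x[c][u] * cmath.exp(1j * phase)
        return term

    def update(self, edge, sign=+1):
        """Process an insertion (sign=+1) or deletion (sign=-1) of a hyperedge."""
        for i, e_star in enumerate(self.h_edges):
            if len(e_star) == len(edge):
                self.z[i] += sign * sum(self._m(e_star, p) for p in permutations(edge))

    def merge(self, other):
        """Combine with a site that used the same random choices (same seed)."""
        self.z = [a + b for a, b in zip(self.z, other.z)]

    def product(self):
        """Z_H(G): the product of all counters."""
        out = 1 + 0j
        for z in self.z:
            out *= z
        return out
```

Because every counter is a sum over stream items, deleting a previously inserted edge restores the earlier state up to floating-point error, and sites that share the same random choices can simply add their counters before the product is taken.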

3 Analysis of the Estimator 

In this section, we first prove that $Z_H(G)$ defined by (3) is an unbiased estimator for $\#(H, G)$.
Then, we analyze the variance of the estimator and the space requirement of our algorithm in order
to achieve a $(1 \pm \varepsilon)$-approximation.

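We first explain the intuition behind our estimator. By (1) and (3) we have

$$Z_H(G) = \prod_{e^* \in E[H]} \left( \sum_{\substack{e \in E[G] \\ \mathrm{size}(e) = \mathrm{size}(e^*) \\ e = \{u_1, \ldots, u_\ell\}}} \; \sum_{(\sigma(1), \ldots, \sigma(\ell)) \in S_\ell} M_{\vec{e^*}}\left(u_{\sigma(1)}, \ldots, u_{\sigma(\ell)}\right) \right). \tag{4}$$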



Since $H$ has $k$ edges, $Z_H(G)$ is a product of $k$ terms, and each term $Z_{\vec{e^*}}(G)$ is a sum over all
possible edges $e$ of $G$ with $\mathrm{size}(e) = \mathrm{size}(e^*)$ together with all possible orientations of $e$. Hence,
in the expansion of $Z_H(G)$, any $k$-tuple $(e_1, \ldots, e_k) \in E^k(G)$ with $\mathrm{size}(e_i) = \mathrm{size}(e_i^*)$ contributes
$\prod_{1 \le i \le k} (\mathrm{size}(e_i)!)$ terms to $Z_H(G)$, and each term corresponds to a certain orientation of the edges
$e_1, \ldots, e_k$.

Let $\vec{T} = (\vec{e}_1, \ldots, \vec{e}_k)$ be an arbitrary orientation of $(e_1, \ldots, e_k)$, and let $G_{\vec{T}}$ be the graph
induced by $\vec{T}$, with vertex set $V_{\vec{T}}$. Our algorithm relies on three types of variables to test if $G_{\vec{T}}$ is isomorphic to
$H$. These variables play different roles, as described below. (i) For $c \in V[H]$ and $w \in V[G]$,
we have $\mathbb{E}[X_c^i(w)] \ne 0$ ($1 \le i \le \deg_H(c)$) if and only if $i = \deg_H(c)$. Random variables $X_c(w)$
guarantee that $G_{\vec{T}}$ contributes to $\mathbb{E}[Z_H(G)]$ only if $H$ is surjectively homomorphic to $G_{\vec{T}}$, i.e., $H$
is homomorphic to $G_{\vec{T}}$ and $|V_{\vec{T}}| \le |V[H]|$. (ii) Through the function $Y : V[G] \to S$, every vertex
$u \in V_{\vec{T}}$ maps to a random element $Y(u)$ in $S$. If $|V_{\vec{T}}| = |S| = t$, then with constant probability the
vertices in $V_{\vec{T}}$ map to $t$ different numbers in $S$. Otherwise, $|V_{\vec{T}}| < t$ and the vertices in $V_{\vec{T}}$ cannot
map to $t$ different elements. Since $Q$ is a random $r$th root of unity, $\mathbb{E}[Q^i] \ne 0$ ($1 \le i \le r$) if and
only if $i = r$, where $r = \sum_{\ell \in S} \ell$. The combination of $Q$ and $Y$ guarantees that $G_{\vec{T}}$ contributes to
$\mathbb{E}[Z_H(G)]$ only if graph $H$ and $G_{\vec{T}}$ have the same number of vertices. Combining (i) and (ii), only
subgraphs isomorphic to $H$ contribute to $\mathbb{E}[Z_H(G)]$.

3.1 Analysis of the First Moment 

Now we show that $Z_H(G)$ defined by (3) is an unbiased estimator. We first list some lemmas that
we use in proving the main theorem.



Lemma 3 ([10]). Let $X_c$ be a randomly chosen $\deg_H(c)$th root of unity, where $c \in V[H]$. Then,
for any $1 \le i \le \deg_H(c)$, it holds that $\mathbb{E}[X_c^i] = 1$ if $i = \deg_H(c)$, and $\mathbb{E}[X_c^i] = 0$ otherwise.

Lemma 4 ([12]). Let $R$ be a primitive $r$th root of unity and $k \in \mathbb{N}$. If $r \mid k$, then $\sum_{\ell=0}^{r-1} (R^k)^\ell = r$;
otherwise $\sum_{\ell=0}^{r-1} (R^k)^\ell = 0$.

Lemma 5 ([13]). Let $x_i \in \mathbb{Z}_{\ge 0}$ and $\sum_{i=0}^{t-1} x_i = t$. Then $2^t - 1 \mid \sum_{i=0}^{t-1} 2^i \cdot x_i$ if and only if
$x_0 = \cdots = x_{t-1} = 1$.
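
The three lemmas can be sanity-checked numerically for small parameters. The sketch below is only an illustration, not a proof. It reads Lemma 5 with the constraint $\sum_i x_i = t$, and the helper names (`mean_power_of_root`, `geometric_sum`, `lemma5_holds`) are hypothetical.

```python
import cmath
from itertools import product

def mean_power_of_root(d, i):
    """E[X^i] for X drawn uniformly from the d-th roots of unity (Lemma 3)."""
    return sum(cmath.exp(2j * cmath.pi * a * i / d) for a in range(d)) / d

def geometric_sum(r, k):
    """Sum over l = 0..r-1 of (R^k)^l for the primitive root R = exp(2*pi*i/r) (Lemma 4)."""
    big_r = cmath.exp(2j * cmath.pi / r)
    return sum((big_r ** k) ** l for l in range(r))

def lemma5_holds(t):
    """Exhaustively check Lemma 5 over all non-negative x_0..x_{t-1} summing to t."""
    r = 2 ** t - 1
    for xs in product(range(t + 1), repeat=t):
        if sum(xs) != t:
            continue
        divisible = sum(2 ** i * x for i, x in enumerate(xs)) % r == 0
        if divisible != all(x == 1 for x in xs):
            return False
    return True

print(abs(mean_power_of_root(3, 3) - 1) < 1e-9)   # i = deg: expectation is 1
print(abs(mean_power_of_root(3, 2)) < 1e-9)       # i < deg: expectation is 0
print(all(lemma5_holds(t) for t in range(1, 6)))
```

The role of Lemma 5 in the analysis is that the values $Y(w)$ of $t$ vertices sum to a multiple of $2^t - 1$ exactly when they are pairwise distinct powers of two, which is what lets $Q$ detect whether a candidate subgraph has $t$ distinct vertices.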

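Theorem 6. Let $H$ be a hypergraph with $t$ vertices and $k$ edges $e_1^*, \ldots, e_k^*$. Assume that variables
$X_c(w)$, $Y(w)$ ($c \in V[H]$, $w \in V[G]$) and $Q$ are defined as above. Then,

$$\mathbb{E}[Z_H(G)] = \frac{t! \cdot \mathrm{auto}(H)}{t^t} \cdot \#(H, G).$$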

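Proof. Let $q_i$ be the size of edge $e_i^*$ in $H$. Consider the expansion of $Z_H(G)$:

$$Z_H(G) = \prod_{e^* \in E[H]} \left( \sum_{\substack{e \in E[G] \\ \mathrm{size}(e) = \mathrm{size}(e^*) \\ e = \{u_1, \ldots, u_\ell\}}} \; \sum_{(\sigma(1), \ldots, \sigma(\ell)) \in S_\ell} M_{\vec{e^*}}\left(u_{\sigma(1)}, \ldots, u_{\sigma(\ell)}\right) \right)
= \sum_{\substack{e_1, \ldots, e_k \in E[G] \\ \forall i:\, \mathrm{size}(e_i) = \mathrm{size}(e_i^*) \\ e_i = \{u_{i,1}, \ldots, u_{i,q_i}\}}} \; \sum_{\substack{\sigma_1, \ldots, \sigma_k \\ \forall i:\, \sigma_i \in S_{q_i}}} \; \prod_{i=1}^{k} M_{\vec{e_i^*}}\left(u_{i,\sigma_i(1)}, \ldots, u_{i,\sigma_i(q_i)}\right).$$

Hence the term corresponding to edges $e_1, \ldots, e_k$ with $\mathrm{size}(e_i) = \mathrm{size}(e_i^*)$ and an arbitrary orientation
$\sigma_1, \ldots, \sigma_k$ of edges $e_1, \ldots, e_k$ is

$$\prod_{1 \le i \le k} M_{\vec{e_i^*}}\left(u_{i,\sigma_i(1)}, \ldots, u_{i,\sigma_i(\mathrm{size}(e_i^*))}\right)
= \prod_{1 \le i \le k} \; \prod_{1 \le j \le \mathrm{size}(e_i^*)} X_{c_j^i}\left(w_j^i\right) \cdot Q^{\frac{Y(w_j^i)}{\deg_H(c_j^i)}}, \tag{5}$$

where $c_j^i$ is the $j$th vertex of edge $\vec{e_i^*}$, and $w_j^i$ is the $j$th vertex of edge $\vec{e}_i$.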

Consider $\vec{T} = (\vec{e}_1, \ldots, \vec{e}_k)$ with $\mathrm{size}(e_i) = \mathrm{size}(e_i^*)$, where $\vec{e}_i$ is determined by $e_i$ and an arbitrary
orientation. We show that the expectation of (5) is non-zero if and only if the graph induced by
$\vec{T}$ is an occurrence of $H$ in $G$. Moreover, if the expectation of (5) is non-zero, then its value is a
constant.

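For a vertex $c$ of $H$ and a vertex $w$ of $G$, let

$$\gamma_{\vec{T}}(c, w) := \left|\left\{(i, j) : c_j^i = c \text{ and } w_j^i = w\right\}\right|$$

be the number of pairs $(i, j)$ where the $j$th vertex of $\vec{e_i^*}$ in $H$ is $c$, and the $j$th vertex of $\vec{e}_i$ in
$\vec{T}$ is $w$. Since every vertex $c$ of $H$ is incident to $\deg_H(c)$ edges, for any $c \in V[H]$, it holds that
$\sum_{w \in V_{\vec{T}}} \gamma_{\vec{T}}(c, w) = \deg_H(c)$. By the definition of $\gamma_{\vec{T}}$, we rewrite (5) as

$$\prod_{c \in V[H]} \; \prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w) \cdot Q^{\frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)}}.$$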


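Therefore we can rewrite $Z_H(G)$ as

$$Z_H(G) = \sum_{\substack{e_1, \ldots, e_k \in E[G] \\ \forall i:\, \mathrm{size}(e_i) = \mathrm{size}(e_i^*) \\ e_i = \{u_{i,1}, \ldots, u_{i,q_i}\}}} \; \sum_{\substack{\sigma_1, \ldots, \sigma_k \\ \forall i:\, \sigma_i \in S_{q_i} \\ \vec{T} = (\vec{e}_1, \ldots, \vec{e}_k)}} \left( \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w) \right) \cdot \left( \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}}} Q^{\frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)}} \right),$$

where the first summation is over all $k$-tuples of edges in $E[G]$ with $\mathrm{size}(e_i) = \mathrm{size}(e_i^*)$, and the
second summation is over all possible permutations of vertices of edges $e_1, \ldots, e_k$. By linearity
of expectations of these random variables and the assumption that $X_c(w)$ ($c \in V[H]$, $w \in V[G]$),
$Y(w)$ ($w \in V[G]$) and $Q$ have sufficient independence, we have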

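$$\mathbb{E}[Z_H(G)] = \sum_{\substack{e_1, \ldots, e_k \in E[G] \\ \forall i:\, \mathrm{size}(e_i) = \mathrm{size}(e_i^*) \\ e_i = \{u_{i,1}, \ldots, u_{i,q_i}\}}} \; \sum_{\substack{\sigma_1, \ldots, \sigma_k \\ \forall i:\, \sigma_i \in S_{q_i} \\ \vec{T} = (\vec{e}_1, \ldots, \vec{e}_k)}} \left( \prod_{c \in V[H]} \mathbb{E}\left[ \prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w) \right] \right) \cdot \mathbb{E}\left[ \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}}} Q^{\frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)}} \right].$$

For any $\vec{T}$, let

$$\alpha_{\vec{T}} := \underbrace{\prod_{c \in V[H]} \mathbb{E}\left[ \prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w) \right]}_{A} \cdot \underbrace{\mathbb{E}\left[ \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}}} Q^{\frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)}} \right]}_{B}. \tag{6}$$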



We will next show that $\alpha_{\vec{T}}$ is either zero or a nonzero constant independent of $\vec{T}$. The latter is the
case only if $G_{\vec{T}}$, the undirected hypergraph induced by the edge set $\vec{T}$, is isomorphic to hypergraph
$H$.

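First, we consider the product $A$. Assume $A \ne 0$. Using the same technique as [12, 14],
we construct a homomorphism from $H$ to $G_{\vec{T}}$ under the condition $A \ne 0$. Remember that: (i)
for any $c \in V[H]$ and $w \in V_{\vec{T}}$, $\gamma_{\vec{T}}(c, w) \le \deg_H(c)$, and (ii) for any $c \in V[H]$, $w \in V_{\vec{T}}$ and
$0 \le i \le \deg_H(c)$, $\mathbb{E}[X_c^i(w)] \ne 0$ if and only if $i = \deg_H(c)$ or $i = 0$. Therefore, for any fixed $\vec{T}$
and $c \in V[H]$, $\mathbb{E}\left[\prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w)\right] \ne 0$ if and only if $\gamma_{\vec{T}}(c, w) \in \{0, \deg_H(c)\}$ for all $w$. Now,
assume that $\mathbb{E}\left[\prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w)\right] \ne 0$ for every $c \in V[H]$. Then, $\gamma_{\vec{T}}(c, w) \in \{0, \deg_H(c)\}$ for
all $c \in V[H]$ and $w \in V[G]$. Since $\sum_{w} \gamma_{\vec{T}}(c, w) = \deg_H(c)$ for any $c \in V[H]$, there exists for
each $c \in V[H]$ a unique vertex $w \in V_{\vec{T}}$ such that $\gamma_{\vec{T}}(c, w) = \deg_H(c)$. Define $\varphi_{\vec{T}} : V[H] \to V_{\vec{T}}$
as $\varphi_{\vec{T}}(c) = w$ for the vertex $w$ satisfying $\gamma_{\vec{T}}(c, w) = \deg_H(c)$. Then, $\varphi_{\vec{T}}$ is a homomorphism,
i.e., a set $\{u_1, \ldots, u_\ell\} \in E[H]$ implies $\{\varphi_{\vec{T}}(u_1), \ldots, \varphi_{\vec{T}}(u_\ell)\} \in E[G_{\vec{T}}]$. Hence, $A \ne 0$ implies that $H$ is
homomorphic to $G_{\vec{T}}$, and by Lemma 3 we have

$$A = \prod_{c \in V[H]} \mathbb{E}\left[ \prod_{w \in V_{\vec{T}}} X_c^{\gamma_{\vec{T}}(c, w)}(w) \right] = \prod_{c \in V[H]} \mathbb{E}\left[ X_c^{\deg_H(c)}\left(\varphi_{\vec{T}}(c)\right) \right] = 1. \tag{7}$$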



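Second, we consider the product $B$. We will show that, under the condition $A \ne 0$, $G_{\vec{T}}$ is an
occurrence of $H$ if and only if $B \ne 0$. Observe that

$$B = \mathbb{E}\left[ \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}}} Q^{\frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)}} \right] = \mathbb{E}\left[ Q^{\sum_{c \in V[H]} \sum_{w \in V_{\vec{T}}} \frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)}} \right].$$

Case 1: Assume that $G_{\vec{T}}$ is an occurrence of $H$ in $G$. Then, $|V_{\vec{T}}| = |V[H]|$, and the homomorphism
$\varphi_{\vec{T}}$ constructed above is a bijection and an isomorphism. This implies that

$$\sum_{c \in V[H]} \sum_{w \in V_{\vec{T}}} \frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)} = \sum_{w \in V_{\vec{T}}} Y(w).$$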



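Without loss of generality, let $V_{\vec{T}} = \{w_1, \ldots, w_t\}$. By considering all possible choices of $Y(w_1), \ldots, Y(w_t)$,
denoted by $y(w_1), \ldots, y(w_t) \in S$, and the independence between $Q$ and $Y(w)$ ($w \in V[G]$), we have

$$B = \sum_{j=0}^{r-1} \sum_{\substack{y(w_1), \ldots, y(w_t) \in S \\ \vartheta := y(w_1) + \cdots + y(w_t),\; r \mid \vartheta}} \frac{1}{r} \left( \prod_{l=1}^{t} \Pr[Y(w_l) = y(w_l)] \right) \cdot \exp\left( \frac{2\pi i}{r} \cdot j \vartheta \right)
+ \sum_{j=0}^{r-1} \sum_{\substack{y(w_1), \ldots, y(w_t) \in S \\ \vartheta := y(w_1) + \cdots + y(w_t),\; r \nmid \vartheta}} \frac{1}{r} \left( \frac{1}{t} \right)^{t} \cdot \exp\left( \frac{2\pi i}{r} \cdot \vartheta \right)^{j}.$$

Applying Lemma 4 with $R = \exp\left(\frac{2\pi i}{r}\right)$, the second summation is zero. Hence, by Lemma 5 we
have

$$B = \sum_{\substack{y(w_1), \ldots, y(w_t) \in S \\ r \mid y(w_1) + \cdots + y(w_t)}} \left( \frac{1}{t} \right)^{t} = \sum_{\substack{y(w_1), \ldots, y(w_t) \in S \\ y(w_1) + \cdots + y(w_t) = r}} \left( \frac{1}{t} \right)^{t} = \frac{t!}{t^t}.$$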

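Case 2: Assume that $G_{\vec{T}}$ is not an occurrence of $H$ in $G$. Then, $\varphi_{\vec{T}}$ is not a bijection, and
trivially is not an isomorphism. Let $V_{\vec{T}} = \{w_1, \ldots, w_{t'}\}$, where $t' < t$. Then, there is a vertex
$w \in V_{\vec{T}}$ and different $b, c \in V[H]$, such that $\varphi_{\vec{T}}(b) = \varphi_{\vec{T}}(c) = w$. As before, we have

$$\sum_{c \in V[H]} \sum_{w \in V_{\vec{T}}} \frac{\gamma_{\vec{T}}(c, w) \cdot Y(w)}{\deg_H(c)} = \sum_{c \in V[H]} Y\left(\varphi_{\vec{T}}(c)\right).$$

By Lemma 5, $r \nmid \sum_{c \in V[H]} Y(\varphi_{\vec{T}}(c))$ regardless of the choices of $Y(w_1), \ldots, Y(w_{t'})$. Hence,

$$B = \mathbb{E}\left[ Q^{\sum_{c \in V[H]} Y(\varphi_{\vec{T}}(c))} \right] = 0,$$

where the last equality follows from Lemma 4 with $R = \exp\left(\frac{2\pi i}{r}\right)$.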

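By (7) and the two cases above, we have $\alpha_{\vec{T}} = t!/t^t$ if $\varphi_{\vec{T}}$ is an isomorphism, and $\alpha_{\vec{T}} = 0$ otherwise. Note that
for every occurrence of $H$ in $G$, denoted by $H'$, there are $\mathrm{auto}(H)$ isomorphic mappings between $H'$
and $H$, and each such mapping $\varphi_{\vec{T}}$ corresponds to one $\vec{T}$ together with an appropriate orientation
of every edge. Hence, every $H'$ is counted $\mathrm{auto}(H)$ times and

$$\mathbb{E}[Z_H(G)] = \sum_{\substack{e_1, \ldots, e_k \in E[G] \\ \forall i:\, \mathrm{size}(e_i) = \mathrm{size}(e_i^*) \\ e_i = \{u_{i,1}, \ldots, u_{i,q_i}\}}} \; \sum_{\substack{\sigma_1, \ldots, \sigma_k \\ \forall i:\, \sigma_i \in S_{q_i} \\ \vec{T} = (\vec{e}_1, \ldots, \vec{e}_k)}} \alpha_{\vec{T}} = \frac{t! \cdot \mathrm{auto}(H)}{t^t} \cdot \#(H, G).$$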

Proof of Theorem 2. By Theorem 6 we have

$$\#(H, G) = \frac{t^t}{t! \cdot \mathrm{auto}(H)} \cdot \mathbb{E}[Z_H(G)]. \tag{9}$$

Expanding the right-hand side of (9) by the definition of the expectation, the theorem holds. □
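
The identity $\#(H, G) = \frac{t^t}{t! \cdot \mathrm{auto}(H)} \cdot \mathbb{E}[Z_H(G)]$ can be checked exhaustively on a toy instance by averaging $Z_H(G)$ over every possible outcome of $X$, $Y$ and $Q$. The sketch below is an illustration only: it assumes fully independent random variables in place of the limited-independence hash functions, enumerates all outcomes (feasible only for tiny inputs), and uses hypothetical function names.

```python
import cmath
from itertools import permutations, product
from math import factorial

def z_h(h_edges, g_edges, x, y, q_index, r, deg):
    """Z_H(G) for one fixed outcome of the random variables."""
    total = 1 + 0j
    for e_star in h_edges:
        z = 0j
        for e in g_edges:
            if len(e) != len(e_star):          # only edges of matching size contribute
                continue
            for oriented in permutations(e):   # all orientations of e
                term = 1 + 0j
                for c, u in zip(e_star, oriented):
                    phase = 2 * cmath.pi * q_index * y[u] / (r * deg[c])
                    term *= x[c, u] * cmath.exp(1j * phase)
                z += term
        total *= z
    return total

def exact_expectation(h_edges, g_edges):
    """Average Z_H(G) over all outcomes of X, Y and Q."""
    h_vertices = sorted({c for e in h_edges for c in e})
    g_vertices = sorted({u for e in g_edges for u in e})
    t = len(h_vertices)
    r = 2 ** t - 1
    deg = {c: sum(c in e for e in h_edges) for c in h_vertices}
    keys = [(c, w) for c in h_vertices for w in g_vertices]
    roots = [[cmath.exp(2j * cmath.pi * a / deg[c]) for a in range(deg[c])] for c, _ in keys]
    s_values = [2 ** i for i in range(t)]
    total, count = 0j, 0
    for xs in product(*roots):
        x = dict(zip(keys, xs))
        for ys in product(s_values, repeat=len(g_vertices)):
            y = dict(zip(g_vertices, ys))
            for q_index in range(r):
                total += z_h(h_edges, g_edges, x, y, q_index, r, deg)
                count += 1
    return total / count

def scaled_count(h_edges, g_edges, auto_h):
    """Real part of t^t / (t! * auto(H)) * E[Z_H(G)]."""
    t = len({c for e in h_edges for c in e})
    return (t ** t / (factorial(t) * auto_h)) * exact_expectation(h_edges, g_edges).real

# H: one edge {a, b} plus the singleton edges {a} and {b}; auto(H) = 2.
H = [("a", "b"), ("a",), ("b",)]
G = [(0, 1), (1, 2), (0,), (1,), (2,)]
print(round(scaled_count(H, G, auto_h=2), 6))  # prints 2.0: G contains two copies of H
```

In this instance $t = 2$ and $r = 3$, so each of the four oriented copies contributes $\mathbb{E}[Q^{Y(u) + Y(v)}] = 1/2$, giving $\mathbb{E}[Z_H(G)] = 2$ and a scaling factor of $1$.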



3.2 Analysis of the Second Moment 

Now we analyze the variance of $Z_H(G)$ and use Chebyshev's inequality to upper bound the space
requirement of our algorithm in order to get a $(1 \pm \varepsilon)$-approximation of $\#(H, G)$. Our analysis
relies on the following lemma about the number of subgraphs in a hypergraph.

Lemma 7. Let $G$ be a hypergraph with $m$ edges, and $H$ be a hypergraph with $k$ edges and minimum
degree 2. Then $\#(H, G) = O(m^{k/2})$.

Proof. We define the fractional cover $\varphi : E[H] \to [0, 1]$ as $\varphi(e) = 1/2$ for every $e \in E[H]$. Since
the minimum degree of graph $H$ is 2, we have $\sum_{e \in E[H]:\, v \in e} \varphi(e) \ge 1$ for every $v \in V[H]$. Therefore the
fractional cover number $\min_{\varphi} \left\{ \sum_{e \in E[H]} \varphi(e) \right\} \le k/2$. By Theorem 1.1 of [9], the lemma holds. □
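
The covering condition used in this proof is easy to check mechanically. The following sketch, with hypothetical names and tuple-based edges, verifies that assigning weight $1/2$ to every edge covers each vertex with total weight at least 1 whenever the minimum degree is at least 2, and that the total weight is then $k/2$.

```python
from fractions import Fraction

def is_fractional_cover(edges, weights):
    """True if, for every vertex, the weights of the edges containing it sum to at least 1."""
    vertices = {v for e in edges for v in e}
    return all(
        sum(w for e, w in zip(edges, weights) if v in e) >= 1
        for v in vertices
    )

# A hypergraph with k = 4 edges in which every vertex has degree exactly 2.
H = [("a", "b", "c"), ("a", "b"), ("c", "d"), ("d",)]
half = [Fraction(1, 2)] * len(H)
print(is_fractional_cover(H, half), sum(half))  # prints: True 2

# With a vertex of degree 1 the all-halves weighting is no longer a cover.
path = [("a", "b"), ("b", "c")]
print(is_fractional_cover(path, [Fraction(1, 2)] * 2))  # prints: False
```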

Theorem 8. Let $G$ be a hypergraph with $m$ edges, and $H$ be a hypergraph with $k$ edges. Random
variables $X_c(w)$, $Y(w)$ ($c \in V[H]$, $w \in V[G]$) and $Q$ are defined as above. Then the following
statements hold: (1) $\mathbb{E}[Z_H(G) \cdot \overline{Z_H(G)}] = O(m^{2k})$; (2) If the minimum degree of $H$ is at least 2,
then $\mathbb{E}[Z_H(G) \cdot \overline{Z_H(G)}] = O(m^{k})$.



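Proof. By definition we write $\mathbb{E}[Z_H(G) \cdot \overline{Z_H(G)}]$ as

$$\mathbb{E}\left[ Z_H(G) \cdot \overline{Z_H(G)} \right]
= \mathbb{E}\left[ \left( \sum_{\substack{e_1, \ldots, e_k \in E[G] \\ \forall i:\, \mathrm{size}(e_i) = \mathrm{size}(e_i^*) \\ e_i = \{u_{i,1}, \ldots, u_{i,q_i}\}}} \; \sum_{\substack{\sigma_1, \ldots, \sigma_k \\ \forall i:\, \sigma_i \in S_{q_i} \\ \vec{T}_1 = (\vec{e}_1, \ldots, \vec{e}_k)}} \; \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}_1}} X_c^{\gamma_{\vec{T}_1}(c, w)}(w) \cdot Q^{\frac{\gamma_{\vec{T}_1}(c, w) \cdot Y(w)}{\deg_H(c)}} \right)
\cdot \overline{\left( \sum_{\substack{e'_1, \ldots, e'_k \in E[G] \\ \forall i:\, \mathrm{size}(e'_i) = \mathrm{size}(e_i^*)}} \; \sum_{\substack{\sigma'_1, \ldots, \sigma'_k \\ \forall i:\, \sigma'_i \in S_{q_i} \\ \vec{T}_2 = (\vec{e}'_1, \ldots, \vec{e}'_k)}} \; \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}_2}} X_c^{\gamma_{\vec{T}_2}(c, w)}(w) \cdot Q^{\frac{\gamma_{\vec{T}_2}(c, w) \cdot Y(w)}{\deg_H(c)}} \right)} \right]$$

$$= \mathbb{E}\left[ \sum_{e_1, \ldots, e_k \in E[G]} \; \sum_{\sigma_1, \ldots, \sigma_k} \; \sum_{e'_1, \ldots, e'_k \in E[G]} \; \sum_{\sigma'_1, \ldots, \sigma'_k} \; \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}_1 \cup \vec{T}_2}} X_c^{\gamma_{\vec{T}_1}(c, w) - \gamma_{\vec{T}_2}(c, w)}(w) \cdot Q^{\frac{\left(\gamma_{\vec{T}_1}(c, w) - \gamma_{\vec{T}_2}(c, w)\right) \cdot Y(w)}{\deg_H(c)}} \right].$$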

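By linearity of expectations and the condition that random variables $X_c(w)$ ($c \in V[H]$, $w \in V[G]$)
are $(2t \cdot k)$-wise independent, and that $X_c$ ($c \in V[H]$) and $Q$ are chosen independently, we can rewrite
$\mathbb{E}[Z_H \cdot \overline{Z_H}]$ as

$$\sum_{\substack{e_1, \ldots, e_k \in E[G] \\ \forall i:\, \mathrm{size}(e_i) = \mathrm{size}(e_i^*) \\ e_i = \{u_{i,1}, \ldots, u_{i,q_i}\}}} \; \sum_{\substack{\sigma_1, \ldots, \sigma_k \\ \forall i:\, \sigma_i \in S_{q_i} \\ \vec{T}_1 = (\vec{e}_1, \ldots, \vec{e}_k)}} \; \sum_{\substack{e'_1, \ldots, e'_k \in E[G] \\ \forall i:\, \mathrm{size}(e'_i) = \mathrm{size}(e_i^*) \\ e'_i = \{v_{i,1}, \ldots, v_{i,q_i}\}}} \; \sum_{\substack{\sigma'_1, \ldots, \sigma'_k \\ \forall i:\, \sigma'_i \in S_{q_i} \\ \vec{T}_2 = (\vec{e}'_1, \ldots, \vec{e}'_k)}} \alpha_{\vec{T}_1, \vec{T}_2},$$

where the value of $\alpha_{\vec{T}_1, \vec{T}_2}$ is

$$\prod_{c \in V[H]} \mathbb{E}\left[ \prod_{w \in V_{\vec{T}_1 \cup \vec{T}_2}} X_c^{\gamma_{\vec{T}_1}(c, w) - \gamma_{\vec{T}_2}(c, w)}(w) \right] \cdot \mathbb{E}\left[ \prod_{c \in V[H]} \prod_{w \in V_{\vec{T}_1 \cup \vec{T}_2}} Q^{\frac{\left(\gamma_{\vec{T}_1}(c, w) - \gamma_{\vec{T}_2}(c, w)\right) \cdot Y(w)}{\deg_H(c)}} \right] = O(1).$$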



Since $\mathbb{E}[Z_H(G) \cdot \overline{Z_H(G)}]$ has at most $O(m^{2k})$ terms, the first statement holds.

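Now for the second statement. Remember that (i) for any $c \in V[H]$ and $w \in V_{\vec{T}_1 \cup \vec{T}_2}$, $\mathbb{E}[X_c^i(w)] \ne 0$
if and only if $i$ is divisible by $\deg_H(c)$, and (ii) for any $c \in V[H]$ and $w \in V_{\vec{T}_1 \cup \vec{T}_2}$ it holds that
$0 \le \gamma_{\vec{T}_1}(c, w) \le \deg_H(c)$ and $0 \le \gamma_{\vec{T}_2}(c, w) \le \deg_H(c)$. Hence $\alpha_{\vec{T}_1, \vec{T}_2} \ne 0$ only if for any $c \in V[H]$ and
$w \in V[G]$ it holds that (i) $\gamma_{\vec{T}_1}(c, w) = \gamma_{\vec{T}_2}(c, w)$, or (ii) $\gamma_{\vec{T}_1}(c, w) = \deg_H(c)$ and $\gamma_{\vec{T}_2}(c, w) = 0$, or (iii)
$\gamma_{\vec{T}_1}(c, w) = 0$ and $\gamma_{\vec{T}_2}(c, w) = \deg_H(c)$. We partition $V_{\vec{T}_1 \cup \vec{T}_2}$ into three disjoint subsets $A$, $B$ and $C$
defined by $A := V_{\vec{T}_1} \setminus V_{\vec{T}_2}$, $B := V_{\vec{T}_2} \setminus V_{\vec{T}_1}$, and $C := V_{\vec{T}_1} \cap V_{\vec{T}_2}$. Sets $A$, $B$ and $C$ are defined
according to the above conditions (i), (ii) and (iii). By the assumption that the minimum degree
of $H$ is 2, the degree of every vertex in sets $A$, $B$ and $C$ is at least 2. Let $H'$ denote the hypergraph
induced by $\vec{T}_1 \cup \vec{T}_2$, which has at most $2k$ edges. Since there are $O(1)$ different
such $H'$ of constant size, and for each such $H'$ it holds that $\#(H', G) = O(m^{k})$ by Lemma 7,
we have $\mathbb{E}[Z_H(G) \cdot \overline{Z_H(G)}] = O(m^{k})$. □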

By applying Chebyshev's inequality, we can get a $(1 \pm \varepsilon)$-approximation by running copies of our estimator
in parallel and returning the average of their outputs, and this implies our
main theorem (Theorem 1).

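Proof of Theorem 1. We run $s$ parallel and independent copies of our estimator and take the average
value $Z^* = \frac{1}{s} \sum_{i=1}^{s} Z_i$, where each $Z_i$ is the output of the $i$th instance of the estimator. Therefore,
$\mathbb{E}[Z^*] = \mathbb{E}[Z_H(G)]$, and a straightforward calculation shows that

$$\mathbb{E}\left[ Z^* \overline{Z^*} \right] - \left| \mathbb{E}[Z^*] \right|^2 = \frac{1}{s} \left( \mathbb{E}\left[ Z_H(G) \cdot \overline{Z_H(G)} \right] - \left| \mathbb{E}[Z_H(G)] \right|^2 \right).$$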



By Chebyshev's inequality for complex- valued random variables (see, e.g., [ij. Lemma 3]), we have 



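$$\Pr\left[ \left| Z^* - \mathbb{E}[Z^*] \right| \ge \varepsilon \cdot \left| \mathbb{E}[Z^*] \right| \right] \le \frac{\mathbb{E}\left[ Z_H(G) \cdot \overline{Z_H(G)} \right] - \mathbb{E}[Z_H(G)] \cdot \overline{\mathbb{E}[Z_H(G)]}}{s \cdot \varepsilon^2 \cdot \left| \mathbb{E}[Z_H(G)] \right|^2}.$$

By the second statement of Theorem 8 we have

$$\mathbb{E}\left[ Z_H(G) \cdot \overline{Z_H(G)} \right] - \mathbb{E}[Z_H(G)] \cdot \overline{\mathbb{E}[Z_H(G)]} \le \mathbb{E}\left[ Z_H(G) \cdot \overline{Z_H(G)} \right] = O(m^{k}).$$

By choosing $s = O\left( \frac{1}{\varepsilon^2} \cdot \frac{m^k}{(\#H)^2} \right)$, we get

$$\Pr\left[ \left| Z^* - \mathbb{E}[Z^*] \right| \ge \varepsilon \cdot \left| \mathbb{E}[Z^*] \right| \right] \le 1/3.$$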



Hence, the overall space complexity is $O\left( \frac{1}{\varepsilon^2} \cdot \frac{m^k}{(\#H)^2} \cdot \log n \right)$. □
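
The averaging step in this proof can be sketched as follows. The helper computes a number of copies $s$ for which the Chebyshev bound gives failure probability at most $1/3$, given an upper bound on the second moment $\mathbb{E}[Z_H \overline{Z_H}]$ and the value $|\mathbb{E}[Z_H]|$. Both inputs are assumed to be known here purely for illustration (in the analysis they are replaced by asymptotic bounds), and the function names are hypothetical.

```python
from math import ceil

def copies_needed(second_moment, mean_abs, eps):
    """Smallest s with variance / (s * eps^2 * |mean|^2) <= 1/3 (Chebyshev)."""
    variance = second_moment - mean_abs ** 2
    return max(1, ceil(3 * variance / (eps ** 2 * mean_abs ** 2)))

def average(outputs):
    """Z* = (1/s) * sum of the outputs of s independent copies of the estimator."""
    return sum(outputs) / len(outputs)

# Example: E[Z * conj(Z)] <= 10, |E[Z]| = 2, eps = 0.5.
print(copies_needed(10, 2, 0.5))  # prints 18
```

Averaging $s$ independent copies divides the variance by $s$ while leaving the expectation unchanged, which is why the space bound scales with $\frac{1}{\varepsilon^2} \cdot \frac{\mathbb{E}[Z_H \overline{Z_H}]}{|\mathbb{E}[Z_H]|^2}$.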